Fast Learning VIEWNET Architectures for Recognizing 3-D Objects from Multiple 2-D Views

Authors

Stephen Grossberg

Gary Bradski

Abstract

The recognition of 3-D objects from sequences of their 2-D views is modeled by a family of self-organizing neural architectures, called VIEWNET, that use View Information Encoded With NETworks. VIEWNET incorporates a preprocessor that generates a compressed but 2-D invariant representation of an image, a supervised incremental learning system that classifies the preprocessed representations into 2-D view categories whose outputs are combined into 3-D invariant object categories, and a working memory that makes a 3-D object prediction by accumulating evidence from 3-D object category nodes as multiple 2-D views are experienced. The simplest VIEWNET achieves high recognition scores without the need to explicitly code the temporal order of 2-D views in working memory. Working memories are also discussed that save memory resources by implicitly coding temporal order in terms of the relative activity of 2-D view category nodes, rather than as explicit 2-D view transitions. Variants of the VIEWNET architecture may also be used for scene understanding by using a preprocessor and classifier that can determine both What objects are in a scene and Where they are located. The present VIEWNET preprocessor includes the CORT-X 2 filter, which discounts the illuminant, regularizes and completes figural boundaries, and suppresses image noise. This boundary segmentation is rendered invariant under 2-D translation, rotation, and dilation by use of a log-polar transform. The invariant spectra undergo Gaussian coarse coding to further reduce noise and 3-D foreshortening effects, and to increase generalization. These compressed codes are input into the classifier, a supervised learning system based on the fuzzy ARTMAP algorithm. Fuzzy ARTMAP learns 2-D view categories that are invariant under 2-D image translation, rotation, and dilation as well as 3-D image transformations that do not cause a predictive error.
Evidence from sequences of 2-D view categories converges at 3-D object nodes that generate a response invariant under changes of 2-D view. These 3-D object nodes input to a working memory that accumulates evidence over time to improve object recognition. In the simplest working memory, each occurrence (nonoccurrence) of a 2-D view category increases (decreases) the corresponding node's activity in working memory. The maximally active node is used to predict the 3-D object. Recognition is studied with noisy and clean images using slow and fast learning. Slow learning at the fuzzy ARTMAP map field is adapted to learn the conditional probability of the 3-D object given the selected 2-D view category. VIEWNET is demonstrated on an MIT Lincoln Laboratory database of 128x128 2-D views of aircraft with and without additive noise. A recognition rate of up to 90% is achieved with one 2-D view and of up to 98.5% correct with three 2-D views. The properties of 2-D view and 3-D object category nodes are compared with those of cells in monkey inferotemporal cortex.
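
The simplest working memory described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `predict_object`, the `gain` and `decay` constants, and the floor at zero are all assumptions made for illustration. Each incoming 2-D view is assumed to have already been classified into an index naming the 3-D object node it supports.

```python
from typing import List, Sequence, Tuple


def predict_object(
    view_votes: Sequence[int],
    n_objects: int,
    gain: float = 1.0,   # assumed increment when a node's view occurs
    decay: float = 0.5,  # assumed decrement when it does not occur
) -> Tuple[int, List[float]]:
    """Accumulate evidence over a sequence of views and pick the winner.

    view_votes[t] is the index of the object node supported by the view
    seen at step t (hypothetical output of the classifier stage).
    """
    activity = [0.0] * n_objects
    for vote in view_votes:
        for node in range(n_objects):
            if node == vote:
                # Occurrence: raise this node's working-memory activity.
                activity[node] += gain
            else:
                # Nonoccurrence: lower it, floored at zero (assumption).
                activity[node] = max(0.0, activity[node] - decay)
    # The maximally active node gives the 3-D object prediction.
    winner = max(range(n_objects), key=lambda i: activity[i])
    return winner, activity


if __name__ == "__main__":
    winner, activity = predict_object([0, 1, 1], n_objects=3)
    print(winner, activity)
```

In this sketch a single misleading view (the first vote for object 0) is outweighed once two later views agree on object 1, which is the intuition behind the reported gain from one view to three views. Temporal order is not stored explicitly; it only shows up in the relative activities.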

Similar resources

VIEWNET Architectures for Invariant 3-D Object Learning and Recognition from Multiple 2-D Views

The recognition of 3-D objects from sequences of their 2-D views is modeled by a family of self-organizing neural architectures, called VIEWNET, that use View Information Encoded With NETworks. VIEWNET incorporates a preprocessor that generates a compressed but 2-D invariant representation of an image, a supervised incremental learning system (Fuzzy ARTMAP) that classifies the prepr...

Fast-learning VIEWNET architectures for recognizing three-dimensional objects from multiple two-dimensional views

The recognition of three-dimensional (3-D) objects from sequences of their two-dimensional (2-D) views is modeled by a family of self-organizing neural architectures, called VIEWNET, that use View Information Encoded With NETworks. VIEWNET incorporates a preprocessor that generates a compressed but 2-D invariant representation of an image, a supervised incremental learning system that...

Spatiotemporal information during unsupervised learning enhances viewpoint invariant object recognition.

Recognizing objects is difficult because it requires both linking views of an object that can be different and distinguishing objects with similar appearance. Interestingly, people can learn to recognize objects across views in an unsupervised way, without feedback, just from the natural viewing statistics. However, there is intense debate regarding what information during unsupervised learning...

Utilizing Temporal Associations for View-based 3-D Object Recognition

We propose an architecture for the recognition of three-dimensional objects on the basis of viewer-centered representations and temporal associations. Motivated by biological findings and by successful computational implementations, we have chosen a viewer-centered representation scheme. In contrast to other implementations, special attention is paid to the temporal order of the views, which prov...

Fast 3-D Object Recognition using Feature Based Aspect-Trees

Olaf Munkelt, Christoph Zierl. Technische Universität München, Institut für Informatik, Prof. Dr. B. Radig, 80290 München, Germany. Abstract: This contribution focuses on the recognition of a priori known 3-D objects in single 2-D images. The underlying model is embedded in the domain of CAD-based vision using a viewer-centered approach to generate ...

Publication date: 1995
